Revisiting corpus creation and analysis tools for translation tasks
Similar resources
The Swedish-Turkish Parallel Corpus and Tools for its Creation
We present a Swedish-Turkish parallel corpus and the automatic annotation procedure with tools that we have been using in order to build the corpus efficiently. The method presented here can be transferred directly to build other parallel corpora.
Open Source Corpus Analysis Tools for Malay
Tokenisers, lemmatisers and POS taggers are vital to the linguistic and digital furtherment of any language. In this paper, we present an open source toolkit for Malay incorporating a word and sentence tokeniser, a lemmatiser and a partial POS tagger, based on heavy reuse of pre-existing language resources. We outline the software architecture of each component, and present an evaluation of eac...
Corpus Analysis Tools for Computational Hook Discovery
Compared to studies with symbolic music data, advances in music description from audio have overwhelmingly focused on ground truth reconstruction and maximizing prediction accuracy, with only a small fraction of studies using audio description to gain insight into musical data. We present a strategy for the corpus analysis of audio data that is inspired by the FANTASTIC toolbox and optimized fo...
Tools for End-User Creation and Customization of Interfaces for Information Management Tasks
Information based tasks rely on software applications that allow users to interact with information in some pre-defined manner deemed appropriate by the application developer or information/content provider. Whereas such an approach facilitates one way of working with the information, it does not (and cannot) take into account the unique needs of the user, e.g., the particular content of intere...
Spam Corpus Creation for TREC
TREC’s Spam Filtering Track (Cormack & Lynam, 2005) introduces a standard testing framework that is designed to model a spam filter’s usage as closely as possible, to measure quantities that reflect the filter’s effectiveness for its intended purpose, and to yield repeatable (i.e. controlled and statistically valid) results. The TREC Spam Filter Evaluation Toolkit is free software that, given a...
Journal
Journal title: Cadernos de Tradução
Year: 2016
ISSN: 2175-7968
DOI: 10.5007/2175-7968.2016v36nesp1p62